91 research outputs found
Efficient $\tilde{O}(n/\epsilon)$ Spectral Sketches for the Laplacian and its Pseudoinverse
In this paper we consider the problem of efficiently computing
$\epsilon$-sketches for the Laplacian and its pseudoinverse. Given a Laplacian $\mathcal{L}$
and an error tolerance $\epsilon$, we seek to construct a function $f$ such
that for any vector $x$ (chosen obliviously from $f$), with high probability
$(1-\epsilon)\, x^\top A x \leq f(x) \leq (1+\epsilon)\, x^\top A x$, where $A$ is
either the Laplacian or its pseudoinverse. Our goal is to construct such a
sketch efficiently and to store it in the least space possible.
We provide nearly-linear time algorithms that, when given a Laplacian matrix
$\mathcal{L} \in \mathbb{R}^{n \times n}$ and an error tolerance $\epsilon$,
produce $\tilde{O}(n/\epsilon)$-size sketches of both $\mathcal{L}$ and its
pseudoinverse. Our algorithms improve upon the previous best sketch size of
$\tilde{O}(n/\epsilon^{1.6})$ for sketching the Laplacian form by Andoni
et al. (2015) and $O(n/\epsilon^2)$ for sketching the Laplacian pseudoinverse
by Batson, Spielman, and Srivastava (2008).
Furthermore, we show how to compute all-pairs effective resistances from our
$\tilde{O}(n/\epsilon)$-size sketch in $\tilde{O}(n^2/\epsilon)$ time.
This improves upon the previous best running time of $\tilde{O}(n^2/\epsilon^2)$
by Spielman and Srivastava (2008).
Comment: Accepted to SODA 2018; v2 fixes a small bug in the proof of Lemma 3.
This does not affect the correctness of any of our results.
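For intuition, here is a minimal numpy sketch (not the paper's $\tilde{O}(n/\epsilon)$ construction) of the classic Johnson-Lindenstrauss approach of Spielman and Srivastava that the all-pairs effective resistance result improves upon; the demo graph, the `laplacian` helper, and the sketch-size constant are illustrative assumptions.

```python
import numpy as np

def laplacian(n, edges):
    """Build the graph Laplacian L = D - A of an unweighted graph."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    return L

def effective_resistances_exact(L):
    """All-pairs effective resistances via the pseudoinverse:
    R(u, v) = (e_u - e_v)^T L^+ (e_u - e_v)."""
    Lp = np.linalg.pinv(L)
    d = np.diag(Lp)
    return d[:, None] + d[None, :] - 2 * Lp

def effective_resistances_sketched(L, edges, eps=0.5, seed=0):
    """JL-sketched estimates: R(u, v) ~ ||Q B L^+ (e_u - e_v)||^2, where
    B is the edge-vertex incidence matrix (B^T B = L) and Q is a random
    Gaussian sketch with O(log n / eps^2) rows."""
    n = L.shape[0]
    B = np.zeros((len(edges), n))
    for i, (u, v) in enumerate(edges):
        B[i, u], B[i, v] = 1.0, -1.0
    k = int(np.ceil(24 * np.log(n) / eps**2))   # JL constant: demo choice
    rng = np.random.default_rng(seed)
    Q = rng.standard_normal((k, len(edges))) / np.sqrt(k)
    Z = Q @ B @ np.linalg.pinv(L)               # k x n; its columns suffice
    diff = Z[:, :, None] - Z[:, None, :]
    return np.sum(diff**2, axis=0)

# Demo on a small cycle graph
n = 6
edges = [(i, (i + 1) % n) for i in range(n)]
L = laplacian(n, edges)
print(np.round(effective_resistances_exact(L), 3))
print(np.round(effective_resistances_sketched(L, edges), 3))
```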
Exploiting Numerical Sparsity for Efficient Learning: Faster Eigenvector Computation and Regression
In this paper, we obtain improved running times for regression and top
eigenvector computation for numerically sparse matrices. Given a data matrix
$A \in \mathbb{R}^{n \times d}$ where every row $a \in \mathbb{R}^d$ has
$\|a\|_2^2 \leq L$ and numerical sparsity at most $s$, i.e.
$\|a\|_1^2 / \|a\|_2^2 \leq s$, we provide faster algorithms for these problems in many
parameter settings.
For top eigenvector computation, we obtain a running time of
$\tilde{O}(nd + r(s + \sqrt{rs})/\mathrm{gap}^2)$, where $\mathrm{gap} > 0$ is the relative
gap between the top two eigenvectors of $A^\top A$ and $r$ is the stable rank
of $A$. This running time improves upon the previous best unaccelerated running
time of $O(nd + rd/\mathrm{gap}^2)$, as it is always the case that
$r \leq d$ and $s \leq d$.
For regression, we obtain a running time of
$\tilde{O}(nd + (nL/\mu)\sqrt{s\,nL/\mu})$, where $\mu > 0$ is the smallest eigenvalue of $A^\top A$.
This running time improves upon the previous best unaccelerated running time of
$\tilde{O}(nd + nLd/\mu)$. This result expands the regimes where regression
can be solved in nearly linear time from when $L/\mu = \tilde{O}(1)$ to when
$L/\mu = \tilde{O}(d^{2/3}/(sn)^{1/3})$.
Furthermore, we obtain similar improvements even when row norms and numerical
sparsities are non-uniform, and we show how to achieve even faster running times
by accelerating using approximate proximal point [Frostig et al. 2015] /
catalyst [Lin et al. 2015]. Our running times depend only on the size of the
input and natural numerical measures of the matrix, i.e. eigenvalues and
norms, making progress on a key open problem regarding optimal running
times for efficient large-scale learning.
Comment: To appear in NIPS 2018
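To make the parameters above concrete, here is a short numpy sketch (ours, not the paper's code) computing the numerical sparsity $s$ of a row and the stable rank $r$ of a matrix; note that a vector can have $d$ nonzeros yet small numerical sparsity.

```python
import numpy as np

def numerical_sparsity(a):
    """Numerical sparsity s(a) = ||a||_1^2 / ||a||_2^2. Always in [1, d];
    equals the nonzero count for flat vectors, but can be much smaller
    for vectors with decaying entries."""
    return np.linalg.norm(a, 1) ** 2 / np.linalg.norm(a, 2) ** 2

def stable_rank(A):
    """Stable rank r(A) = ||A||_F^2 / ||A||_2^2 <= rank(A)."""
    return np.linalg.norm(A, 'fro') ** 2 / np.linalg.norm(A, 2) ** 2

rng = np.random.default_rng(0)
d = 1000
flat = np.ones(d)                        # s = d: as dense as possible
decaying = 1.0 / np.arange(1, d + 1)     # numerically sparse
print(numerical_sparsity(flat))          # 1000.0
print(numerical_sparsity(decaying))      # ~34, despite having d nonzeros
print(stable_rank(rng.standard_normal((200, d))))
```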
Coordinate Methods for Accelerating $\ell_\infty$ Regression and Faster Approximate Maximum Flow
We provide faster algorithms for approximately solving $\ell_\infty$
regression, a fundamental problem prevalent in both combinatorial and
continuous optimization. In particular, we provide accelerated coordinate
descent methods capable of provably exploiting dynamic measures of coordinate
smoothness, and apply them to $\ell_\infty$ regression over a box to give
algorithms which converge in $k$ iterations at a $O(1/k)$ rate. Our algorithms
can be viewed as an alternative approach to the recent breakthrough result of
Sherman [She17], which achieves a similar runtime improvement over classic
algorithmic approaches, i.e. smoothing and gradient descent, which either
converge at a $O(1/\sqrt{k})$ rate or have running times with a worse
dependence on problem parameters. Our runtimes match those of [She17] across a
broad range of parameters and achieve improvement in certain structured cases.
We demonstrate the efficacy of our result by providing faster algorithms for
the well-studied maximum flow problem. Directly leveraging our accelerated
$\ell_\infty$ regression algorithms implies a $\tilde{O}(m + \sqrt{mn}/\epsilon)$
runtime to compute an $\epsilon$-approximate maximum
flow for an undirected graph with $m$ edges and $n$ vertices, generically
improving upon the previous best known runtime of $\tilde{O}(m/\epsilon)$
in [She17] whenever the graph is slightly
dense. We further design an algorithm adapted to the structure of the
$\ell_\infty$ regression problem induced by maximum flow, obtaining a runtime of
$\tilde{O}(m + \sqrt{n\,\mathcal{C}}/\epsilon)$, where $\mathcal{C}$ is the
squared $\ell_2$ norm of the congestion of any optimal flow. Moreover, we show
how to leverage this result to achieve improved exact algorithms for maximum
flow on a variety of unit capacity graphs. We hope that our work serves as an
important step towards achieving even faster maximum flow algorithms.
Comment: A preliminary version appeared in FOCS 2018, with an error in the
accelerated coordinate descent proof. Originally we claimed a stronger
approximate maximum flow runtime; this version obtains a weaker bound. The
$\ell_\infty$ regression results have been substantially improved, with better
dependence on column sparsity than in the earlier version.
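As a rough illustration of the underlying idea, here is a toy coordinate method on a softmax smoothing of $\|Ax - b\|_\infty$. The smoothing parameter, the static per-coordinate smoothness bounds, and the greedy coordinate rule are our simplifications; this does not implement the paper's dynamic-smoothness sampling or acceleration.

```python
import numpy as np

def logsumexp(z):
    m = z.max()
    return m + np.log(np.exp(z - m).sum())

def smooth_linf(A, x, b, alpha):
    """Softmax smoothing of ||Ax - b||_inf with parameter alpha:
    overestimates the max by at most log(2m)/alpha."""
    r = A @ x - b
    return logsumexp(np.concatenate([alpha * r, -alpha * r])) / alpha

def smooth_linf_grad(A, x, b, alpha):
    r = A @ x - b
    z = np.concatenate([alpha * r, -alpha * r])
    p = np.exp(z - logsumexp(z))
    m = len(r)
    return A.T @ (p[:m] - p[m:])

def coordinate_descent(A, b, alpha=50.0, iters=5000):
    """Coordinate steps on the smoothed objective, with conservative
    static per-coordinate smoothness bounds L_j = alpha * ||A[:, j]||^2
    (the paper instead exploits dynamic smoothness estimates)."""
    n = A.shape[1]
    x = np.zeros(n)
    L = alpha * (A ** 2).sum(axis=0)
    for _ in range(iters):
        g = smooth_linf_grad(A, x, b, alpha)
        j = np.argmax(np.abs(g) / np.sqrt(L))   # most promising coordinate
        x[j] -= g[j] / L[j]
    return x

rng = np.random.default_rng(1)
A, b = rng.standard_normal((40, 10)), rng.standard_normal(40)
x = coordinate_descent(A, b)
print(np.max(np.abs(A @ x - b)), smooth_linf(A, x, b, 50.0))
```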
Path Finding I: Solving Linear Programs with $\tilde{O}(\sqrt{\mathrm{rank}})$ Linear System Solves
In this paper we present a new algorithm for solving linear programs that
requires only $\tilde{O}(\sqrt{\mathrm{rank}(A)}\,L)$ iterations to solve a linear program
with $m$ constraints, $n$ variables, constraint matrix $A$, and bit
complexity $L$. Each iteration of our method consists of solving $\tilde{O}(1)$
linear systems and additional nearly linear time computation.
Our method improves upon the previous best iteration bound by a factor of
$\tilde{\Omega}((m/\mathrm{rank}(A))^{1/4})$ for methods with polynomial time computable
iterations and by $\tilde{\Omega}((m/\mathrm{rank}(A))^{1/2})$ for methods which solve
at most $\tilde{O}(1)$ linear systems in each iteration. Our method is
parallelizable and amenable to linear algebraic techniques for accelerating the
linear system solver. As such, up to polylogarithmic factors we either match or
improve upon the best previous running times in both depth and work for
different ratios of $m$ and $\mathrm{rank}(A)$.
Moreover, our method matches up to polylogarithmic factors a theoretical
limit established by Nesterov and Nemirovski in 1994 regarding the use of a
"universal barrier" for interior point methods, thereby resolving a
long-standing open question regarding the running time of polynomial time
interior point methods for linear programming.
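For readers unfamiliar with the "iterations = linear system solves" accounting, here is a minimal short-step log-barrier interior point sketch: the textbook $O(\sqrt{m})$-iteration method, not the paper's $\tilde{O}(\sqrt{\mathrm{rank}})$ weighted barrier. The step rule, constants, and demo LP are illustrative assumptions.

```python
import numpy as np

def solve_lp_barrier(A, b, c, x0, t0=1.0, tol=1e-8):
    """Short-step log-barrier method for  min c^T x  s.t.  Ax > b,
    given a strictly feasible x0. The dominant cost per iteration is one
    linear system solve in the barrier Hessian A^T S^{-2} A."""
    m, n = A.shape
    x, t = x0.astype(float), t0
    while m / t > tol:                      # duality-gap proxy
        for _ in range(2):                  # a couple of Newton steps per t
            s = A @ x - b                   # slacks, must stay positive
            g = t * c - A.T @ (1.0 / s)     # gradient of t*c^T x - sum log s_i
            H = A.T @ ((1.0 / s**2)[:, None] * A)
            x -= np.linalg.solve(H, g)      # the linear system solve
        t *= 1.0 + 1.0 / (8.0 * np.sqrt(m))  # short-step growth of t
    return x

# Tiny demo: min x1 + x2 over the box 0 <= x <= 1, written as Ax > b
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.array([0.0, 0.0, -1.0, -1.0])
c = np.array([1.0, 1.0])
print(solve_lp_barrier(A, b, c, x0=np.array([0.5, 0.5])))  # ~ (0, 0)
```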
Efficient Accelerated Coordinate Descent Methods and Faster Algorithms for Solving Linear Systems
In this paper we show how to accelerate randomized coordinate descent methods
and achieve faster convergence rates without paying per-iteration costs in
asymptotic running time. In particular, we show how to generalize and
efficiently implement a method proposed by Nesterov, giving faster asymptotic
running times for various algorithms that use standard coordinate descent as a
black box. In addition to providing a proof of convergence for this new general
method, we show that it is numerically stable, efficiently implementable, and
in certain regimes, asymptotically optimal.
To highlight the computational power of this algorithm, we show how it can
be used to create faster linear system solvers in several regimes:
- We show how this method achieves a faster asymptotic runtime than conjugate
gradient for solving a broad class of symmetric positive definite systems of
equations.
- We improve the best known asymptotic convergence guarantees for Kaczmarz
methods, a popular technique for image reconstruction and solving
overdetermined systems of equations, by accelerating a randomized algorithm of
Strohmer and Vershynin.
- We achieve the best known running time for solving Symmetric Diagonally
Dominant (SDD) systems of equations in the unit-cost RAM model, obtaining an
$O(m \log^{3/2} n \, (\log\log n)^{1/2} \log(\log n/\epsilon))$ asymptotic running time by
accelerating a recent solver by Kelner et al.
Beyond the independent interest of these solvers, we believe they highlight
the versatility of the approach of this paper and we hope that they will open
the door for further algorithmic improvements in the future.
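For reference, below is a minimal numpy implementation of the (unaccelerated) randomized Kaczmarz method of Strohmer and Vershynin that the paper accelerates; the problem sizes and iteration count are arbitrary demo values.

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=20000, seed=0):
    """Randomized Kaczmarz: repeatedly project the iterate onto the
    solution hyperplane of one row, sampling row i with probability
    ||a_i||^2 / ||A||_F^2. Converges linearly in expectation at a rate
    governed by the scaled condition number of A. (The paper above
    accelerates this; here is the unaccelerated base method.)"""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms_sq = (A ** 2).sum(axis=1)
    probs = row_norms_sq / row_norms_sq.sum()
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        x += (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 20))      # overdetermined, consistent system
x_true = rng.standard_normal(20)
x = randomized_kaczmarz(A, A @ x_true)
print(np.linalg.norm(x - x_true))       # ~ 0
```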
Efficient Inverse Maintenance and Faster Algorithms for Linear Programming
In this paper, we consider the following inverse maintenance problem: given
$A \in \mathbb{R}^{n \times d}$ and a number of rounds $r$, we receive a
$n \times n$ diagonal matrix $D^{(k)}$ at round $k$ and we wish to maintain an
efficient linear system solver for $A^\top D^{(k)} A$ under the assumption that
$D^{(k)}$ does not change too rapidly. This inverse maintenance problem is the
computational bottleneck in solving multiple optimization problems. We show how
to solve this problem with $\tilde{O}(\mathrm{nnz}(A) + d^{\omega})$ preprocessing time
and amortized $\tilde{O}(\mathrm{nnz}(A) + d^2)$ time per round, improving upon previous
running times for solving this problem.
Consequently, we obtain the fastest known running times for solving multiple
problems including linear programming and computing a rounding of a polytope.
In particular, given a feasible point in a linear program with $n$ variables,
$d$ constraints, and constraint matrix $A \in \mathbb{R}^{d \times n}$, we show
how to solve the linear program in time
$\tilde{O}((\mathrm{nnz}(A) + d^2)\sqrt{d}\,\log(\epsilon^{-1}))$. We achieve our results
through a novel combination of classic numerical techniques of low rank update,
preconditioning, and fast matrix multiplication as well as recent work on
subspace embeddings and spectral sparsification that we hope will be of
independent interest.
Comment: In an older version of this paper, we mistakenly claimed an improved
running time for the Dikin walk by noting solely the improved running time for
linear system solving and ignoring the determinant computation.
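The classic low-rank update primitive behind inverse maintenance can be sketched in a few lines. This is a plain Sherman-Morrison-Woodbury update under the assumption that only a few diagonal entries change per round; it omits the paper's preconditioning, sketching, and amortization machinery, and uses explicit inverses only for clarity.

```python
import numpy as np

def initial_inverse(A, d0):
    """Preprocess: invert M = A^T D0 A once (a real solver would keep a
    factorization instead of an explicit inverse)."""
    return np.linalg.inv(A.T @ (d0[:, None] * A))

def update_inverse(Minv, A, d_old, d_new):
    """If only k diagonal entries changed, update (A^T D A)^{-1} via
    Sherman-Morrison-Woodbury in O(ndk + k^3) time instead of
    refactoring from scratch: with U the k touched rows of A and
    C = diag(delta), (M + U^T C U)^{-1}
      = Minv - Minv U^T (C^{-1} + U Minv U^T)^{-1} U Minv."""
    changed = np.flatnonzero(d_new != d_old)
    if changed.size == 0:
        return Minv
    U = A[changed]                           # k x d
    C = np.diag(d_new[changed] - d_old[changed])
    S = np.linalg.inv(np.linalg.inv(C) + U @ Minv @ U.T)
    return Minv - Minv @ U.T @ S @ U @ Minv

rng = np.random.default_rng(0)
n, d = 500, 30
A = rng.standard_normal((n, d))
d0 = rng.uniform(1.0, 2.0, size=n)
Minv = initial_inverse(A, d0)
d1 = d0.copy(); d1[:5] *= 1.5                # only 5 coordinates change
Minv = update_inverse(Minv, A, d0, d1)
print(np.linalg.norm(Minv @ (A.T @ (d1[:, None] * A)) - np.eye(d)))  # ~ 0
```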
Efficient Profile Maximum Likelihood for Universal Symmetric Property Estimation
Estimating symmetric properties of a distribution, e.g. support size,
coverage, entropy, and distance to uniformity, is among the most fundamental
problems in algorithmic statistics. While each of these properties has been
studied extensively and separate optimal estimators are known for each, in
striking recent work, Acharya et al. 2016 showed that there is a single
estimator that is competitive for all symmetric properties. This work proved
that computing the distribution that approximately maximizes \emph{profile
likelihood (PML)}, i.e. the probability of the observed frequency of frequencies,
and returning the value of the property on this distribution is sample
competitive with respect to a broad class of estimators of symmetric
properties. Further, they showed that even computing an approximation of the
PML suffices to achieve such a universal plug-in estimator. Unfortunately,
prior to this work there was no known polynomial time algorithm to compute an
approximate PML and it was open to obtain a polynomial time universal plug-in
estimator through the use of approximate PML. In this paper we provide an
algorithm that, given $n$ samples from a distribution,
computes an approximate PML distribution up to a multiplicative error of
$\exp(-n^{2/3}\log n)$ in time nearly linear in $n$ (the number of samples).
Generalizing work of Acharya et al. 2016 on the utility of approximate PML, we
show that our algorithm provides a nearly linear time universal plug-in
estimator for all symmetric functions up to accuracy $\epsilon$ for all
$\epsilon$ above an inverse-polynomial threshold in $n$. Further, we show how to extend our work to provide
efficient polynomial-time algorithms for computing a $d$-dimensional
generalization of PML (for constant $d$) that allows for universal plug-in
estimation of symmetric relationships between distributions.
Comment: 68 pages
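To see what the "frequency of frequencies" statistic is, here is a small self-contained sketch; `profile` and the entropy plug-in are our illustrative helpers, and actually computing an (approximate) PML distribution is the hard step this paper addresses.

```python
import numpy as np
from collections import Counter

def profile(samples):
    """The profile (frequency of frequencies): phi[j] is the number of
    distinct symbols appearing exactly j times. Every symmetric property
    depends on the sample only through this statistic, which is what
    makes PML a universal plug-in."""
    freqs = Counter(samples).values()
    return Counter(freqs)

def empirical_entropy_from_profile(phi, n):
    """Example: the naive plug-in entropy is itself a function of the
    profile alone (PML would instead first fit a distribution that
    approximately maximizes the probability of observing phi)."""
    return -sum(cnt * (j / n) * np.log(j / n) for j, cnt in phi.items())

rng = np.random.default_rng(0)
samples = rng.choice(100, size=1000, p=np.full(100, 0.01))
phi = profile(samples)
print(dict(phi))
print(empirical_entropy_from_profile(phi, len(samples)), np.log(100))
```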
Stability of the Lanczos Method for Matrix Function Approximation
The ubiquitous Lanczos method can approximate $f(A)x$ for any symmetric
matrix $A \in \mathbb{R}^{n \times n}$, vector $x$, and function $f$. In exact arithmetic, the
method's error after $k$ iterations is bounded by the error of the best
degree-$k$ polynomial uniformly approximating $f$ on the range
$[\lambda_{\min}(A), \lambda_{\max}(A)]$. However, despite decades of work, it
has been unclear if this powerful guarantee holds in finite precision.
We resolve this problem, proving that when
$\max_{x \in [\lambda_{\min}(A), \lambda_{\max}(A)]} |f(x)| \leq C$, Lanczos essentially matches the exact arithmetic
guarantee if computations use roughly $\log(nC\|A\|)$ bits of precision. Our
proof extends work of Druskin and Knizhnerman [DK91], leveraging the stability
of the classic Chebyshev recurrence to bound the stability of any polynomial
approximating $f(x)$.
We also study the special case of $f(A) = A^{-1}$, where stronger guarantees
hold. In exact arithmetic Lanczos performs as well as the best polynomial
approximating $1/x$ at each of $A$'s eigenvalues, rather than on the full
eigenvalue range. In seminal work, Greenbaum gives an approach to extending
this bound to finite precision: she proves that finite precision Lanczos and
the related CG method match any polynomial approximating $1/x$ in a tiny range
around each eigenvalue [Gre89].
For $A^{-1}$, this bound appears stronger than ours. However, we exhibit
matrices with condition number $\kappa$ where exact arithmetic Lanczos
converges in $\mathrm{polylog}(\kappa)$ iterations, but Greenbaum's bound predicts
$\mathrm{poly}(\kappa)$ iterations. It thus cannot offer significant improvement
over the $O(\sqrt{\kappa})$ bound achievable via our result. Our analysis raises
the question of whether convergence in fewer than $\mathrm{poly}(\kappa)$ iterations can be
expected in finite precision, even for matrices with clustered, skewed, or
otherwise favorable eigenvalue distributions.
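A bare-bones Lanczos routine makes the setting concrete. This sketch uses the plain three-term recurrence with no reorthogonalization, which is exactly the regime whose finite-precision behavior is at issue; the test matrix and function are arbitrary demo choices.

```python
import numpy as np

def lanczos_fA_b(A, x, k, f):
    """Approximate f(A) x with k Lanczos iterations: build an orthonormal
    Krylov basis Q and tridiagonal T = Q^T A Q, then return
    ||x|| * Q f(T) e_1."""
    n = len(x)
    Q = np.zeros((n, k + 1))
    alpha, beta = np.zeros(k), np.zeros(k)
    Q[:, 0] = x / np.linalg.norm(x)
    for j in range(k):
        w = A @ Q[:, j] - (beta[j - 1] * Q[:, j - 1] if j > 0 else 0)
        alpha[j] = w @ Q[:, j]
        w -= alpha[j] * Q[:, j]
        beta[j] = np.linalg.norm(w)
        if beta[j] < 1e-14:                 # Krylov space is exhausted
            k = j + 1
            break
        Q[:, j + 1] = w / beta[j]
    T = np.diag(alpha[:k]) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
    evals, V = np.linalg.eigh(T)            # f(T) via eigendecomposition
    fT_e1 = V @ (f(evals) * V[0])           # f(T) @ e_1
    return np.linalg.norm(x) * Q[:, :k] @ fT_e1

rng = np.random.default_rng(0)
n = 300
M = rng.standard_normal((n, n))
A = (M + M.T) / np.sqrt(n) + 5 * np.eye(n)  # symmetric, positive spectrum
x = rng.standard_normal(n)
approx = lanczos_fA_b(A, x, k=40, f=np.exp)
evals, U = np.linalg.eigh(A)
exact = U @ (np.exp(evals) * (U.T @ x))
print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))  # ~ machine eps
```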
Memory-Sample Tradeoffs for Linear Regression with Small Error
We consider the problem of performing linear regression over a stream of
$d$-dimensional examples, and show that any algorithm that uses a subquadratic
amount of memory exhibits a slower rate of convergence than can be achieved
without memory constraints. Specifically, consider a sequence of labeled
examples $(a_1, b_1), (a_2, b_2), \ldots$ with $a_i$ drawn independently from a
$d$-dimensional isotropic Gaussian, and where $b_i = \langle a_i, x \rangle + \eta_i$ for a fixed $x \in \mathbb{R}^d$ with $\|x\|_2 = 1$ and with
$\eta_i$ independent noise drawn uniformly from the interval
$[-2^{-d/5}, 2^{-d/5}]$. We show that any algorithm with at most $d^2/4$ bits of
memory requires at least $\Omega(d \log\log\frac{1}{\epsilon})$ samples to
approximate $x$ to $\ell_2$ error $\epsilon$ with probability of success at
least $2/3$, for $\epsilon$ sufficiently small as a function of $d$. In
contrast, for such $\epsilon$, $x$ can be recovered to error $\epsilon$ with
probability $1 - o(1)$ with memory $O(d^2 \log(1/\epsilon))$ using $d$
examples. This represents the first nontrivial lower bounds for regression with
super-linear memory, and may open the door for strong memory/sample tradeoffs
for continuous optimization.
Comment: 22 pages, to appear in STOC'19
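The high-memory side of the tradeoff is elementary to demonstrate: with $\Theta(d^2)$ memory one can store and solve a $d \times d$ linear system from just $d$ examples. A toy sketch following the abstract's noise model (the dimension is a demo value):

```python
import numpy as np

def recover_from_d_examples(A, b):
    """High-memory regime: with d linearly independent examples, solving
    the d x d system recovers x up to the (tiny) noise level, using
    O(d^2) numbers of storage and only d samples."""
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
d = 100
x = rng.standard_normal(d); x /= np.linalg.norm(x)   # ||x||_2 = 1
A = rng.standard_normal((d, d))                      # d Gaussian examples
noise = rng.uniform(-2.0**(-d / 5), 2.0**(-d / 5), size=d)
b = A @ x + noise
err = np.linalg.norm(recover_from_d_examples(A, b) - x)
print(err)    # tiny: the noise scale, amplified by cond(A)
```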
Parallel Reachability in Almost Linear Work and Square Root Depth
In this paper we provide a parallel algorithm that, given any $n$-node
$m$-edge directed graph and a source vertex $s$, computes all vertices reachable
from $s$ with $\tilde{O}(m)$ work and $n^{1/2 + o(1)}$ depth with high
probability in $n$. This algorithm also computes a set of $\tilde{O}(n)$ edges
which, when added to the graph, preserves reachability and ensures that the
diameter of the resulting graph is at most $n^{1/2 + o(1)}$. Our result
improves upon the previous best known almost linear work reachability algorithm
due to Fineman, which had depth $\tilde{O}(n^{2/3})$.
Further, we show how to leverage this algorithm to achieve improved
distributed algorithms for single source reachability in the CONGEST model. In
particular, we provide a distributed algorithm that, given a $n$-node digraph of
undirected hop-diameter $D$, solves the single source reachability problem with
$\tilde{O}(n^{1/2} + n^{1/3 + o(1)} D^{2/3})$ rounds of communication in
the CONGEST model with high probability in $n$. Our algorithm is nearly optimal
whenever $D = O(n^{1/4})$ and is the
first nearly optimal algorithm for general graphs whose diameter is
$\Omega(n^{\delta})$ for any constant $\delta > 0$.
Comment: 38 pages. v2 fixes a small typo in Section 4 found by Aaron
Bernstein. v3 fixes some overflow issues. v4 fixes the proof of Lemma 5.1. We
thank Aaron Bernstein for pointing this out.
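A toy example of diameter-reducing shortcuts on a directed path (our illustration, not the paper's construction): a near-linear number of power-of-two jump edges preserves reachability while cutting the hop diameter, and hence the depth of a BFS-style parallel search, from $n - 1$ to $O(\log n)$.

```python
from collections import deque

def bfs_depth(n, adj, s):
    """Hop distance from s; the largest finite value is the number of
    rounds a naive parallel BFS would need."""
    dist = [-1] * n
    dist[s] = 0
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if dist[v] == -1:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

# Directed path 0 -> 1 -> ... -> n-1: diameter n-1.
n = 1024
path = [[i + 1] if i + 1 < n else [] for i in range(n)]
print(max(bfs_depth(n, path, 0)))       # 1023

# Add shortcut edges jumping powers of two: reachability is preserved
# and the diameter drops to O(log n).
shortcuts = [list(adj) for adj in path]
for i in range(n):
    hop = 2
    while i + hop < n:
        shortcuts[i].append(i + hop)
        hop *= 2
print(max(bfs_depth(n, shortcuts, 0)))  # ~ log2(n)
```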